The northeast region of the continental United States is an area
of high population density, a pattern that holds in states such as
New Jersey and Pennsylvania, which the Delaware Water Gap National
Recreation Area straddles. The recreation area is distinctive in that
it is situated between two major interstates, I-80 and I-84. With a
growing global population and sprawling
urbanization, understanding the influence of development within close
proximity to national protected areas is important for future urban
planning. For this analysis, we chose to focus on point count data for
four species of woodpeckers: downy woodpecker, pileated woodpecker,
red-bellied woodpecker, and hairy woodpecker. Since these woodpeckers
reside in the area year-round, they are appropriate species to choose
when focusing on local environmental variables over time.
Ambient noise level, the presence of road noise, and hemlock tree condition score are used as possible explanatory variables for the detection count of the woodpecker species. Ambient noise level and the presence of road noise serve as indicators of human influence. Hemlock condition scores describe the health of nearby hemlocks, which is relevant because woodpeckers are known to use hemlock snags for nesting. This study investigates whether these possible explanatory variables have a statistically significant relationship with woodpecker detection count, and how the variables vary spatially.
Question 1: Is there a visual correlation
between the possible explanatory variables (ambient noise, road noise,
and hemlock condition) and the detection of woodpecker species in the
Delaware Water Gap from 2014-2025?
Question 2: How does the appearance of woodpecker species, ambient noise, road noise, and hemlock condition vary spatially?
Question 3: Is there a statistically significant relationship between the possible explanatory variables (ambient noise, road noise, and hemlock condition) and the detection of woodpecker species in the Delaware Water Gap from 2014-2025?
The dataset used in this analysis was downloaded from the Data.gov
federal database and was originally published by the National Park
Service (NPS). The data was gathered as a part of the Eastern Rivers and
Mountains Network Streamside Bird Monitoring Protocol. The dataset
includes avian point counts from the following six different National
Parks:
The data was collected during May and June from 2011 to 2025. Each park consisted of multiple sites, with each site having three point count stations. The sites were sampled repeatedly, typically four times annually. Point counts were conducted for ten minutes each, providing ample data on the bird species detected. Additional environmental factors such as ambient noise, road noise, and hemlock condition were noted for each individual point count station by date of observation. Ambient noise was recorded in decibels with a handheld sensor, with values between 30 and 100 dB. Road noise was recorded as a true or false status, with true denoting that the observer noted evidence of road noise. Hemlock condition score was assigned by the observer, with zero denoting that no hemlock trees were observed within a 50-meter radius of the point count station. Scores of one to four denoted the health of the hemlocks observed, with higher scores indicating greater decay.
| Information | Description |
|---|---|
| Data Publisher | Department of the Interior - National Park Service |
| Data Source | https://catalog.data.gov/dataset/streamside-bird-monitoring-data-in-eastern-rivers-and-mountains-network-parks-2011-2025-da |
| Variables Used | Unit Code, Site Name, Start Date ISO, Ambient Noise, Road Noise, Common Name, Hemlock Condition Score, Latitude, Longitude |
| Data Range | May 27th, 2011 to May 19th, 2025 |
After importing the dataset, the date column was converted to a
date type and a year column was created for easier visual grouping.
Unnecessary columns were removed from the data frame, and it was
filtered to include only records of the four woodpecker species within
the Delaware Water Gap National Recreation Area. Columns with
unintuitive names were renamed. After this, the main data frame was
considered wrangled. Further wrangling of this main data frame produced
data frames that grouped the woodpecker species and the possible
explanatory variables by location and time, which could otherwise act
as confounding variables. This created detection counts for the
woodpecker species at each level of the environmental factors.
The data frames were used to create averages of the ambient noise and hemlock condition scores per year and per site as well as the yearly road noise status per species. Additional data frames were created with the possible explanatory variables and their associated coordinates for mapping analysis.
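The wrangling steps described above can be sketched in a few lines. This is a minimal illustration in Python using only the standard library, not the code used for this report. The column names follow the "Variables Used" row of the table above, but the unit code `DEWA`, the exact species spellings, the helper names `wrangle` and `yearly_summary`, and the example rows are all assumptions made for illustration.

```python
from collections import defaultdict
from datetime import date
from statistics import mean

# Assumed spellings of the four species in the "Common Name" column.
WOODPECKERS = {
    "Downy Woodpecker",
    "Hairy Woodpecker",
    "Pileated Woodpecker",
    "Red-bellied Woodpecker",
}


def wrangle(rows, unit_code="DEWA"):
    """Keep woodpecker records from one park and add a Year field."""
    kept = []
    for row in rows:
        if row["Unit Code"] != unit_code or row["Common Name"] not in WOODPECKERS:
            continue
        record = dict(row)
        # Parse only the date part so full ISO timestamps also work.
        record["Year"] = date.fromisoformat(row["Start Date ISO"][:10]).year
        kept.append(record)
    return kept


def yearly_summary(records):
    """Detection count and mean ambient noise per (year, species)."""
    groups = defaultdict(list)
    for r in records:
        groups[(r["Year"], r["Common Name"])].append(float(r["Ambient Noise"]))
    return {
        key: {"detections": len(vals), "mean_noise": mean(vals)}
        for key, vals in groups.items()
    }


# Made-up rows for illustration only; these are not values from the NPS dataset.
rows = [
    {"Unit Code": "DEWA", "Common Name": "Downy Woodpecker",
     "Start Date ISO": "2014-05-27", "Ambient Noise": "40"},
    {"Unit Code": "DEWA", "Common Name": "Downy Woodpecker",
     "Start Date ISO": "2014-06-03", "Ambient Noise": "50"},
    {"Unit Code": "DEWA", "Common Name": "Wood Thrush",
     "Start Date ISO": "2014-06-03", "Ambient Noise": "45"},
    {"Unit Code": "OTHER", "Common Name": "Hairy Woodpecker",
     "Start Date ISO": "2015-05-20", "Ambient Noise": "55"},
]
summary = yearly_summary(wrangle(rows))
print(summary)
```

Grouping by site instead of year, or averaging the hemlock condition score instead of ambient noise, would follow the same pattern with a different key or value column.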
The exploratory analysis of our data set comprises several plots
and spatial map objects that relate our explanatory variables to
woodpecker observations in the Delaware Water Gap National Recreation
Area. First, the total observation count of all four species across all
sites was plotted per year. This shows the relative frequencies of
woodpecker appearance in the park and how detections may have changed
over time. Next, the explanatory variables were graphed over time. The
ambient noise levels were plotted by date to show the spread of noise
levels recorded when a woodpecker was detected, and a line plot was
produced to show the average ambient noise of all observations per
year. We then examined road noise observations over time. Because the
presence or absence of road noise was recorded in the data, a plot was
created comparing the presence or absence of road noise per year and by
species. The final explanatory variable, hemlock condition, was graphed
as an average value over time and by species. Because it is a discrete
value assigned per observation, we examined whether the average score
differed from year to year and how it related to woodpecker
appearance.
Figure 1. Line Plot of Species Appearance Over Time
Figure 1 graphs the total number of observations of each
woodpecker species per year.
Figure 2. Scatter Plot of Ambient Noise Over Time
Figure 2 plots all ambient noise values across all sites over
time. Values are in decibels.
Figure 3. Line Plot of Average Ambient Noise
Figure 3 graphs the average ambient noise per year, with separate
lines for each of the four species.
Figure 4. Road Noise Presence or Absence Across All Sites
Figure 4 graphs all sites with regard to the presence of road
noise, with the segments of each bar representing the four woodpecker
species.
Figure 5. Scatter Plot of Hemlock Condition Over Time
Figure 5 represents the average hemlock condition score across
all sites per year, by each of the four species.
Figure 6. Comparison of Hemlock Condition and Detection Count
Figure 6 compares hemlock condition score and the number of
detections of each of the four woodpecker species. Visually, a score of
three is the most common observation.
After plotting woodpecker appearance and the explanatory
variables over time, our objective was to show the observation sites and
frequency of woodpecker observations spatially. To begin, we have
provided a mapview object with two layers, one with every observation
site in the data, and the other showing each site with a size that
varies based on the number of observations recorded at that site. Next,
we wanted to provide readers with an understanding of how the
explanatory variables differ spatially. As such, we created another
mapview object, with layers representing average ambient noise per site,
sites with road noise, sites without road noise, and the average hemlock
score per site.
Map 1. Map of All Monitoring Sites in the Delaware Water Gap National Recreation Area
Map 2. Map of Observations of the Four Woodpecker Species in the Delaware Water Gap National Recreation Area
Map 3. Map of Average Ambient Noise Levels (dB) and Average Hemlock Score Across all Sites
Map 4. Map of Sites with Observations of Road Noise and Without Road Noise
With the data wrangling and exploratory analysis complete, we
can now run statistical analyses on each of the explanatory variables
to see if they have any effect on woodpecker sightings. The explanatory
variables differ in type and therefore need to be analyzed in different
ways. To measure whether hemlock condition influences woodpecker
sightings, we ran a one-way ANOVA test because the hemlock data is
categorical with more than two levels. We also ran a post-hoc Tukey's
Honest Significant Difference (HSD) test to see which pairs of hemlock
condition scores, if any, differ in woodpecker sightings. To analyze
the impact of ambient noise, we used a simple linear regression because
ambient noise is a single continuous explanatory variable. We also
graphed the results with a linear model (lm) line of best fit to
illustrate how ambient noise and woodpecker sightings are related. To
analyze how road noise affected woodpecker sightings, we ran a t-test
because road noise is the only explanatory variable with exactly two
categories. To visualize the effects of road noise on woodpecker
sightings, we also graphed sighting occurrences based on whether road
noise was present. These tests allow us to determine whether the
variables have any impact on woodpecker sightings.
Figure 7.
The one-way ANOVA indicates that woodpecker sightings differ
significantly among hemlock condition scores, because the p-value is
0.0236, which is below the 0.05 threshold. However, when we ran the
Tukey HSD test, no pair of hemlock condition scores showed a
significant difference in woodpeckers spotted; all of the pairwise
p-values were above 0.05.
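To make the ANOVA step concrete, the following is a minimal Python sketch of how a one-way ANOVA F statistic is computed from grouped detection counts. It is an illustration only, not the code behind the result above. The function name `one_way_anova_f` and the example counts are hypothetical, and the sketch stops at the F statistic rather than computing a p-value.

```python
from statistics import mean


def one_way_anova_f(groups):
    """Return (F, df_between, df_within) for a list of samples."""
    all_values = [x for g in groups for x in g]
    grand_mean = mean(all_values)
    k, n = len(groups), len(all_values)
    # Variation of the group means around the grand mean.
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    # Variation of each observation around its own group mean.
    ss_within = sum((x - mean(g)) ** 2 for g in groups for x in g)
    df_between, df_within = k - 1, n - k
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within


# Made-up detection counts grouped by hemlock condition score (illustration only).
counts_by_score = [[1, 2, 3], [2, 3, 4], [5, 6, 7]]
f_stat, df_b, df_w = one_way_anova_f(counts_by_score)
print(f_stat, df_b, df_w)
```

A large F means the group means differ by more than the within-group scatter would suggest. The p-value comes from comparing F with an F distribution on `df_between` and `df_within` degrees of freedom, which a statistics package (for example `scipy.stats.f_oneway`) handles directly.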
Figure 8. Woodpecker Sightings Based on Ambient Noise Level
The p-value is 0.04804, which is below 0.05, so we reject the
null hypothesis that the true correlation between ambient noise and
woodpecker sightings is equal to 0. The estimated correlation is
-0.08991391, which indicates a weak negative relationship.
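The quantities reported for the ambient noise analysis can be illustrated with a short Python sketch that computes a Pearson correlation, the least-squares line, and the t statistic used to test whether the correlation differs from 0. This is a sketch under stated assumptions, not the code behind the reported values. The function name `pearson_and_line` and the example data are hypothetical.

```python
from math import sqrt
from statistics import mean


def pearson_and_line(x, y):
    """Return (r, slope, intercept, t) for paired samples x and y."""
    mx, my = mean(x), mean(y)
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    r = sxy / sqrt(sxx * syy)          # Pearson correlation coefficient
    slope = sxy / sxx                  # least-squares slope
    intercept = my - slope * mx        # least-squares intercept
    # Test statistic for H0: true correlation is 0 (n - 2 degrees of freedom).
    t = r * sqrt((len(x) - 2) / (1 - r ** 2))
    return r, slope, intercept, t


# Made-up ambient noise levels (dB) and detection counts (illustration only).
noise = [40, 50, 60, 70]
detections = [4, 3, 3, 1]
r, slope, intercept, t = pearson_and_line(noise, detections)
print(r, slope, intercept, t)
```

In a simple linear regression with one predictor, the test of the slope and the test of the correlation are equivalent, which is why a correlation and a regression p-value can be reported together.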
Figure 9. Woodpecker Sightings Based on Road Noise
The p-value of our t-test was 0.9768, which is well above the
threshold of 0.05, so we fail to reject the null hypothesis that the
two group means are equal. The mean of the group with road noise was
1.793478 and the mean of the group without road noise was 1.798206.
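As an illustration of the two-group comparison, the sketch below computes a two-sample t statistic in Python. It assumes unequal variances (Welch's form); the report does not state which variant of the t-test was used, so this choice is an assumption. The function name `welch_t` and the example counts are hypothetical.

```python
from math import sqrt
from statistics import mean, variance


def welch_t(a, b):
    """Return (t, df) for Welch's two-sample t-test on samples a and b."""
    # Squared standard error contributed by each group.
    va, vb = variance(a) / len(a), variance(b) / len(b)
    t = (mean(a) - mean(b)) / sqrt(va + vb)
    # Welch-Satterthwaite approximation of the degrees of freedom.
    df = (va + vb) ** 2 / (va ** 2 / (len(a) - 1) + vb ** 2 / (len(b) - 1))
    return t, df


# Made-up detection counts with and without road noise (illustration only).
with_road_noise = [1, 2, 3, 2]
without_road_noise = [2, 1, 3, 2]
t_stat, df = welch_t(with_road_noise, without_road_noise)
print(t_stat, df)
```

When the two group means are nearly identical, as in the result above, the t statistic is close to 0 and the p-value is close to 1.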
While we learned much from our study and are ready to answer our questions, we must acknowledge some limitations and potential gaps in our data and its collection. The data was collected only in May and June, and some sites overlap. We therefore have only a small window into woodpecker populations in the Delaware Water Gap National Recreation Area, even though the birds are present year-round. Data collection also relied entirely on observers seeing or hearing the birds, so birds could have been at a site without being detected. The data collection team also had a limited staff, so they were not able to observe every site simultaneously. All of these factors could have led to sightings being missed or misrepresented in the data. Another possible source of error is that road noise may be included in the ambient noise measurements; the data does not specify whether road noise is excluded from ambient noise levels.
From the data, we can see that the downy woodpecker is the most commonly detected woodpecker species in the area, with hairy, pileated, and red-bellied woodpeckers detected slightly less often. No species stood out at particular ambient noise levels; all four were detected at roughly the same average ambient noise level, and there was no observable trend. There also does not appear to be a visual trend relating road noise presence to woodpecker sightings. The number of observations with road noise did differ from the number without, but this could be attributed to the lack of roads in the near vicinity of many sites. Our graphs do show an observable trend for hemlock condition: woodpecker detections were most frequent at a hemlock score of 3. A score of this level corresponds to hemlocks with green, living branches near the crown and dead branches near the base. These conditions appear to be the most favorable for every woodpecker species.
There were also observable trends when we examined the spatial extent of woodpecker observations. The downy woodpecker mainly occupied the northern part of the recreation area. The hairy woodpecker occupied the north and middle of the area. The red-bellied woodpecker was seen more in the southern and middle sections, while the pileated woodpecker was present at similar levels throughout the area. The sites with the most woodpecker observations are in the middle and the north of the recreation area; the southern areas have fewer woodpecker sightings. Ambient noise levels seem to be fairly even across the whole recreation area, with areas closer to roads and the airport having higher average ambient noise levels. Sites with road noise are concentrated in several small clusters along the highway that cuts through the recreation area. Hemlock condition scores also seem to be fairly evenly distributed across the recreation area. There is a small concentration of poor hemlock condition scores within the Delaware Water Gap itself, but outside of that there is no concentration of any specific score.
When we examined our data statistically, we were able to draw conclusions about which explanatory variables are significantly related to woodpecker sightings. Our ANOVA test for hemlock condition score returned a p-value below 0.05, which indicates that woodpecker sightings differ among hemlock condition scores, although the Tukey HSD test did not identify any specific pair of scores that differed significantly. Our simple linear regression also returned a p-value below 0.05, so we reject the null hypothesis of no correlation in favor of the alternative hypothesis that woodpecker sightings and ambient noise levels are correlated. The estimated correlation is negative but only slightly below 0, so rising ambient noise levels are associated with only a slight decrease in the number of woodpecker sightings. When we analyzed the effect of road noise on woodpecker sightings, the p-value was well above 0.05. We therefore fail to reject the null hypothesis: there is no evidence of a difference in the average number of observations between the road noise and no road noise groups. The difference between the two group means is very small, which is consistent with the result of the t-test. Overall, this test gives no evidence of a relationship between woodpecker sightings and road noise.
For future work, we advise that ambient noise observations specify whether road noise was considered a part of the ambient noise level. A larger observation staff able to observe the sites year-round would also be beneficial, as all four species are non-migratory. Finally, the site locations could be distributed more evenly across the recreation area so that they do not overlap.